Git and GitHub for Remote Collaboration
Objective
- Main objective: maintaining code for scientific collaboration.
- Learn effective ways to store code, track changes to it, and collaborate on it.
Why GitHub?
- It is the most widely used platform for version control and collaboration.
- It integrates communication features.
- You can engage and collaborate on code, and also publish information to a webpage.
The difference between Git and GitHub
- Git is the version control system that enables all the collaborative tools available on GitHub.
- Git launched in 2005.
- Basic concepts of Git: commit, push, pull, checkout.
- Git operations are run through the terminal.
- GitHub is web-based: it makes this functionality available to users less familiar with software development.
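The two local concepts, commit and checkout, can be tried in a throwaway directory. This is a minimal sketch, assuming Git 2.28 or later (for `git init -b`); the repository name `analysis`, the file `model.R`, the branch name `experiment`, and the user identity are placeholders invented for illustration. Push and pull need a remote repository, so they are not shown here.

```shell
set -eu
# Work in a throwaway directory so nothing real is touched.
demo="$(mktemp -d)"
cd "$demo"

# Create an empty repository whose first branch is called "main".
git init -q -b main analysis
cd analysis
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.org"   # placeholder identity

# commit: record a snapshot of the staged files, with a message.
echo 'x <- 1' > model.R
git add model.R
git commit -q -m "Add first version of model script"

# checkout: switch to (here, also create) another line of work.
git checkout -q -b experiment
echo 'x <- 2' > model.R
git commit -q -am "Try a different starting value"

# Switching back to main restores the file as it was committed there.
git checkout -q main
cat model.R                 # prints: x <- 1
git log --oneline --all     # lists both commits
```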
GitHub, GitLab, and Bitbucket
They are all similar: each provides a hosting service, which is basically a home for your project on the internet.
It’s like having Dropbox or Google Drive, but for Git-based projects.
This allows other people to see your work, synchronize it, and contribute.
Some GitHub features
Well-designed user interface
Issues: originally a bug tracker, but highly underutilized in our fields
R and GitHub integration is nicer thanks to the active R package development community.
An intro on this can be found here and here
Step-by-step process:
- Create a remote repo and sync it with your local files and directories.
- Modify files locally or remotely.
- Frequently ‘commit’ the changes, with a description of what changed.
- Synchronize commits with GitHub (push and pull).
The repository contains the files, the modifications, and the descriptions of those changes. Others can download the repository (‘clone’) and synchronize it with their own changes.
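The steps above can be sketched end to end without a network connection. This is an illustration under stated assumptions, not the GitHub workflow itself: a local bare repository (`remote.git`) stands in for the repository hosted on GitHub, and the directories `alice` and `bob`, the file contents, and the user identities are hypothetical. Git 2.28 or later is assumed for `git init -b`.

```shell
set -eu
work="$(mktemp -d)"

# Stand-in for the repository on GitHub (assumption: a local bare repo).
git init -q --bare -b main "$work/remote.git"

# Step 1: create a local repository and connect it to the remote.
git init -q -b main "$work/alice"
cd "$work/alice"
git config user.name "Alice Example"          # placeholder identity
git config user.email "alice@example.org"
git remote add origin "$work/remote.git"

# Steps 2-3: modify files and commit with a description.
echo "# Field survey analysis" > README.md
git add README.md
git commit -q -m "Add README"

# Step 4: push the commit to the remote.
git push -q -u origin main

# A collaborator clones the repository and contributes a change.
git clone -q "$work/remote.git" "$work/bob"
cd "$work/bob"
git config user.name "Bob Example"            # placeholder identity
git config user.email "bob@example.org"
echo "Data cleaning steps go here." >> README.md
git commit -q -am "Describe data cleaning section"
git push -q origin main

# Back in the first copy, pull brings in the collaborator's commit.
cd "$work/alice"
git pull -q --ff-only origin main
git log --oneline
```

With a real GitHub repository, the only change would be the remote address passed to `git remote add` and `git clone`.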
Practical ways to use:
- Storage: just because you can version control something doesn’t mean that you should.
- Version control works best for plain-text documents. Git compresses stored versions efficiently, so the history of a text file takes up very little extra space.
- Large data files that never change should not be version controlled.
- If the code is fully reproducible, you shouldn’t need to store the output.
- Better long-term ‘storage’: Zenodo
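One common way to keep such files out of a repository is a `.gitignore` file. The sketch below is only an illustration: the layout (`data/raw/`, `output/`) and the file names are hypothetical, and Git 2.28 or later is assumed for `git init -b`.

```shell
set -eu
proj="$(mktemp -d)"
git init -q -b main "$proj"
cd "$proj"

# Hypothetical project layout: a script, static raw data, regenerated output.
mkdir -p data/raw output
echo 'print("fit model")' > analysis.R
echo "placeholder for a large, unchanging data file" > data/raw/survey.csv
echo "placeholder for a regenerated figure" > output/fig1.png

# Keep large static data and reproducible output out of version control.
cat > .gitignore <<'EOF'
data/raw/
output/
EOF

git add .
git status --short      # only .gitignore and analysis.R are staged
```

The ignored files stay on disk; Git simply does not track them, so they need a separate home such as a data archive.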
Practical ways to use: (cont.)
- Project continuity
- So many researchers hold limited-term appointments.
- Keeping documents only on personal computers doesn’t work for file transfer when people move on.
- Easier code and data handover: stop it with the emails.
- Assigning tasks
Practical ways to use: (cont.)
- Project management: useful for highly collaborative research.
- GitHub Issues for discrete tasks and sub-tasks: identify, assign, categorize, and keep track of history.
- GitHub Discussions: a message board for conversation.
- GitHub Projects: real-time tracking of project priorities and status.
- Examples:
Intro Resources
For the R user, the best simple, straightforward resource out there is Happy Git with R.
GitHub itself has a dedicated section for learning in its docs. In particular, the Hello World tutorial will get you creating a repo, managing a branch, and merging a pull request.
Branches and pull requests
- Create a branch to make a change.
- Commit changes to the new branch.
- Open a pull request to merge the changes into the main branch.
- Optional and recommended: delete the branch.
![]()
Source: https://www.nobledesktop.com/learn/git/git-branches
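The branch workflow can be rehearsed locally. A pull request is a GitHub feature rather than a Git command, so in this sketch a local `git merge` stands in for merging the pull request; on GitHub you would push the branch and open the pull request in the web interface instead. The branch name `add-results`, the file `paper.md`, and the user identity are hypothetical, and Git 2.28 or later is assumed for `git init -b`.

```shell
set -eu
repo="$(mktemp -d)"
git init -q -b main "$repo"
cd "$repo"
git config user.name "Example Researcher"        # placeholder identity
git config user.email "researcher@example.org"

# Starting point on the main branch.
echo "Methods" > paper.md
git add paper.md
git commit -q -m "Start paper outline"

# 1. Create a branch to make a change.
git checkout -q -b add-results

# 2. Commit changes to the new branch.
echo "Results" >> paper.md
git commit -q -am "Add results section"

# 3. Merge into main (local stand-in for merging a pull request).
git checkout -q main
git merge -q --no-ff -m "Merge branch 'add-results'" add-results

# 4. Optional and recommended: delete the merged branch.
git branch -q -d add-results

git log --oneline --graph
```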